OregairuChar: A Benchmark Dataset for Character Appearance Frequency Analysis in My Teen Romantic Comedy SNAFU
Sun, Qi, Zhou, Dingju, Zhang, Lina
The analysis of character appearance frequency is essential for understanding narrative structure, character prominence, and story progression in anime. In this work, we introduce OregairuChar, a benchmark dataset designed for appearance frequency analysis in the anime series My Teen Romantic Comedy SNAFU. The dataset comprises 1600 manually selected frames from the third season, annotated with 2860 bounding boxes across 11 main characters. OregairuChar captures diverse visual challenges, including occlusion, pose variation, and inter-character similarity, providing a realistic basis for appearance-based studies. To enable quantitative research, we benchmark several object detection models on the dataset and leverage their predictions for fine-grained, episode-level analysis of character presence over time. This approach reveals patterns of character prominence and their evolution within the narrative. By emphasizing appearance frequency, OregairuChar serves as a valuable resource for exploring computational narrative dynamics and character-centric storytelling in stylized media.
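To make the episode-level analysis concrete, here is a minimal sketch of how per-episode appearance frequency can be computed from detector output; the record layout, character names, and confidence threshold are illustrative assumptions, not the dataset's actual format.

```python
from collections import Counter

# Hypothetical detection records: (episode, frame_index, character, confidence).
# Values are invented for illustration.
detections = [
    (1, 12, "Hachiman", 0.91),
    (1, 12, "Yukino", 0.88),
    (1, 40, "Yui", 0.75),
    (2, 5, "Hachiman", 0.95),
]

def appearance_frequency(dets, conf_thresh=0.5):
    """Count, per episode, the frames in which each character is detected."""
    seen = set()  # (episode, frame, character), so one frame counts once
    for ep, frame, char, conf in dets:
        if conf >= conf_thresh:
            seen.add((ep, frame, char))
    return Counter((ep, char) for ep, _, char in seen)

print(appearance_frequency(detections))
# e.g. Counter({(1, 'Hachiman'): 1, (1, 'Yukino'): 1, (1, 'Yui'): 1, (2, 'Hachiman'): 1})
```

Plotting these counts per episode gives the character-prominence curves the paper describes.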
AudioRole: An Audio Dataset for Character Role-Playing in Large Language Models
Li, Wenyu, Jiao, Xiaoqi, Chang, Yi, Zhang, Guangyan, Guo, Yiwen
The creation of high-quality multimodal datasets remains fundamental for advancing role-playing capabilities in large language models (LLMs). While existing works predominantly focus on text-based persona simulation, Audio Role-Playing (ARP) presents unique challenges due to the need for synchronized alignment of semantic content and vocal characteristics. To address this gap, we propose AudioRole, a meticulously curated dataset from 13 TV series spanning 1K+ hours with 1M+ character-grounded dialogues, providing synchronized audio-text pairs annotated with speaker identities and contextual metadata. In addition, to demonstrate the effectiveness of the dataset, we introduce ARP-Eval, a dual-aspect evaluation framework that assesses both response quality and role fidelity. Empirical validation shows that GLM-4-Voice trained on AudioRole (which we call the ARP-Model) achieves an average Acoustic Personalization score of 0.31, significantly outperforming the original GLM-4-Voice and the more powerful MiniCPM-O-2.6, which specifically supports role-playing in one-shot scenarios. The ARP-Model also achieves a Content Personalization score of 0.36, surpassing the untrained original model by about 38% and matching MiniCPM-O-2.6. AudioRole features dialogues from over 115 main characters, 6 trained ARP-Models that role-play different characters, and evaluation protocols. Together, they provide an essential resource for advancing audio-grounded role-playing research.
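For readers who want a feel for the data layout, below is a hedged sketch of what one AudioRole example might look like; all field names are assumptions, since the abstract does not specify the released schema.

```python
from dataclasses import dataclass, field

# Illustrative schema only; field names are assumptions, not the released format.
@dataclass
class ARPExample:
    audio_path: str          # path to the utterance waveform
    transcript: str          # text synchronized with the audio
    speaker: str             # character identity label
    series: str              # source TV series
    context: list[str] = field(default_factory=list)  # preceding dialogue turns

example = ARPExample(
    audio_path="clips/ep01_0042.wav",
    transcript="You never listen, do you?",
    speaker="CharacterA",
    series="SeriesX",
    context=["Where were you last night?"],
)
print(example.speaker, example.transcript)
```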
20 books by female authors for Women's History Month
These authors made history with their powerful books. March is Women's History Month, a time dedicated to honoring the powerful, inspiring and trailblazing women who have contributed amazing things to our world. What better way to celebrate this month than by diving into books written by women? Female authors have written a diverse range of books, from novels to memoirs, to science fiction and horror. Get your bookmarks ready and prepare to be captivated by these must-read books for Women's History Month. Follow an eccentric artist and her daughter through this short novel.
Universal Narrative Model: an Author-centric Storytelling Framework for Generative AI
In their survey of authoring tools for computational narrative, Kybartas and Bidarra note that "we believe that creating a standard model of computational narrative could allow different systems to interact with the same narrative, without being restricted by incompatible models and definitions. Furthermore, such a model would also facilitate research into the generation of specific story components, e.g., allowing for multiple generators and even authors to collaborate on a given narrative" [Kybartas and Bidarra 2017]. This paper proposes such a standard: the Universal Narrative Model (UNM). We foresee that generative AI will enable a new paradigm of storytelling technologies and processes: from assisting a writer of linear media (novels, film, television, etc.) by allowing them to test out scenes and characters before committing them to a script, all the way through to real-time storytelling systems in videogames which respond to a player's agency, and countless use cases in between [Peng et al. 2024]. The UNM is designed to serve any use case in which coherent narrative structure is a consideration and in which authorial intent and direction are privileged. In the last five years, a robust body of research has demonstrated a wide variety of potential uses for computational narrative systems powered by generative AI, and some limited commercial deployments already exist [Yang et al. 2024; Hu et al. 2024]. With such promise, however, comes a series of challenges: technical, narrative, and ethical. The goal of the Entertainment Technology Center's "Universal Narrative Model" project was to produce the UNM as an open standard. The ultimate directive of the project was to privilege, above all else, author-centric design and functionality, setting the stage for generative workflows which extend an author's narrative intent and creativity rather than eclipse or replace it.
Which books do I like?
Rosenbusch, Hannes, Meral, Erdem Ozan
Finding enjoyable fiction books can be challenging, partly because stories are multi-faceted and one's own literary taste might be difficult to ascertain. Here, we introduce the ISAAC method (Introspection-Support, AI-Annotation, and Curation), a pipeline that supports fiction readers in gaining awareness of their literary preferences and finding enjoyable books. ISAAC consists of four steps: a user supplies book ratings, an AI agent researches and annotates the provided books, patterns in book enjoyment are reviewed by the user, and the AI agent recommends new books. In this proof-of-concept self-study, the authors test whether ISAAC can highlight idiosyncratic patterns in their book enjoyment, spark a deeper reflection about their literary tastes, and make accurate, personalized recommendations of enjoyable books and underexplored literary niches. Results highlight substantial advantages of ISAAC over existing methods, such as its integration of automation and intuition, its accurate and customizable annotations, and its explainable book recommendations. Observed disadvantages are that ISAAC's outputs can elicit false self-narratives (if statistical patterns are taken at face value), that books cannot be annotated if their online documentation is lacking, and that people who are new to reading have to rely on assumed book ratings or movie ratings to power the ISAAC pipeline. We discuss additional opportunities of ISAAC-style book annotations for the study of literary trends and the scientific classification of books and readers.
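The four-step pipeline is easy to picture in code. The following skeleton is a minimal sketch with stubbed-out annotation and recommendation steps; the facets, ratings, and function boundaries are invented for illustration and are not the authors' implementation.

```python
from collections import defaultdict

def collect_ratings():
    # Step 1: the user supplies book ratings (title -> 1-5 stars).
    return {"Book A": 5, "Book B": 2}

def annotate(books):
    # Step 2: an AI agent researches each book and attaches facets
    # (genre, pacing, tone, ...). Hard-coded here as a stand-in.
    facets = {"Book A": {"genre": "sci-fi", "pacing": "fast"},
              "Book B": {"genre": "memoir", "pacing": "slow"}}
    return {b: facets[b] for b in books}

def review_patterns(ratings, annotations):
    # Step 3: surface patterns for the user to introspect on,
    # e.g. mean rating per facet value.
    totals = defaultdict(list)
    for book, rating in ratings.items():
        for facet, value in annotations[book].items():
            totals[(facet, value)].append(rating)
    return {k: sum(v) / len(v) for k, v in totals.items()}

ratings = collect_ratings()
annotations = annotate(ratings)
print(review_patterns(ratings, annotations))
# Step 4 (recommendation) would query the agent for unread books
# matching the highest-rated facet values.
```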
$\infty$-Video: A Training-Free Approach to Long Video Understanding via Continuous-Time Memory Consolidation
Santos, Saul, Farinhas, António, McNamee, Daniel C., Martins, André F. T.
Current video-language models struggle with long-video understanding due to limited context lengths and reliance on sparse frame subsampling, often leading to information loss. This paper introduces $\infty$-Video, which can process arbitrarily long videos through a continuous-time long-term memory (LTM) consolidation mechanism. Our framework augments video Q-formers by allowing them to process unbounded video contexts efficiently and without requiring additional training. Through continuous attention, our approach dynamically allocates higher granularity to the most relevant video segments, forming "sticky" memories that evolve over time. Experiments with Video-LLaMA and VideoChat2 demonstrate improved performance in video question-answering tasks, showcasing the potential of continuous-time LTM mechanisms to enable scalable and training-free comprehension of long videos.
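As a rough intuition for the consolidation idea, the toy sketch below allocates a fixed memory budget across video segments in proportion to an assumed relevance score, so more relevant segments are kept at finer granularity. This illustrates the principle only; it is not the paper's continuous-attention mechanism.

```python
import numpy as np

rng = np.random.default_rng(0)
frames = rng.normal(size=(120, 8))            # 120 frame features, dim 8
segments = np.array_split(np.arange(120), 6)  # 6 equal segments
relevance = np.array([0.05, 0.30, 0.05, 0.40, 0.10, 0.10])  # assumed scores

budget = 24  # total frames the long-term memory may keep
alloc = np.maximum(1, np.round(relevance / relevance.sum() * budget)).astype(int)

memory = []
for seg, k in zip(segments, alloc):
    # Keep k evenly spaced frames from this segment: finer sampling where relevant.
    idx = np.linspace(seg[0], seg[-1], num=min(k, len(seg))).astype(int)
    memory.append(frames[idx])
memory = np.concatenate(memory)
print(memory.shape)  # (23, 8) here: close to the 24-frame budget
```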
Facial Dynamics in Video: Instruction Tuning for Improved Facial Expression Perception and Contextual Awareness
Zhao, Jiaxing, Sun, Boyuan, Chen, Xiang, Wei, Xihan
Facial expression captioning has found widespread application across various domains. Recently, the emergence of video Multimodal Large Language Models (MLLMs) has shown promise in general video understanding tasks. However, describing facial expressions within videos poses two major challenges for these models: (1) the lack of adequate datasets and benchmarks, and (2) the limited visual token capacity of video MLLMs. To address these issues, this paper introduces a new instruction-following dataset tailored for dynamic facial expression captioning. The dataset comprises 5,033 manually annotated high-quality video clips containing over 700,000 tokens. Its purpose is to improve the capability of video MLLMs to discern subtle facial nuances. Furthermore, we propose FaceTrack-MM, which leverages a limited number of tokens to encode the main character's face. This model demonstrates superior performance in tracking faces and focusing on the facial expressions of the main characters, even in intricate multi-person scenarios. Additionally, we introduce a novel evaluation metric combining event extraction, relation classification, and the longest common subsequence (LCS) algorithm to assess the content consistency and temporal sequence consistency of generated text. Moreover, we present FEC-Bench, a benchmark designed to assess the performance of existing video MLLMs in this specific task. All data and source code will be made publicly available.
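Of the metric's three ingredients, the LCS component is a standard algorithm, sketched below over made-up event labels; the event extraction and relation classification steps are omitted, and the normalization choice is an assumption.

```python
def lcs_length(a, b):
    """Classic dynamic-programming longest common subsequence over two event lists."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i-1][j-1] + 1 if x == y else max(dp[i-1][j], dp[i][j-1])
    return dp[len(a)][len(b)]

# Invented expression-event sequences extracted from reference and generated captions.
reference = ["neutral", "frown", "smile"]
generated = ["frown", "smile", "laugh"]
score = lcs_length(reference, generated) / max(len(reference), len(generated))
print(score)  # 0.666...: the order of shared events is preserved
```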
M$^3$oralBench: A MultiModal Moral Benchmark for LVLMs
Yan, Bei, Zhang, Jie, Chen, Zhiyuan, Shan, Shiguang, Chen, Xilin
Recently, large foundation models, including large language models (LLMs) and large vision-language models (LVLMs), have become essential tools in critical fields such as law, finance, and healthcare. As these models increasingly integrate into our daily life, it is necessary to conduct moral evaluation to ensure that their outputs align with human values and remain within moral boundaries. Previous works primarily focus on LLMs, proposing moral datasets and benchmarks limited to text modality. However, given the rapid development of LVLMs, there is still a lack of multimodal moral evaluation methods. To bridge this gap, we introduce M$^3$oralBench, the first MultiModal Moral Benchmark for LVLMs. M$^3$oralBench expands the everyday moral scenarios in Moral Foundations Vignettes (MFVs) and employs the text-to-image diffusion model, SD3.0, to create corresponding scenario images. It conducts moral evaluation across six moral foundations of Moral Foundations Theory (MFT) and encompasses tasks in moral judgement, moral classification, and moral response, providing a comprehensive assessment of model performance in multimodal moral understanding and reasoning. Extensive experiments on 10 popular open-source and closed-source LVLMs demonstrate that M$^3$oralBench is a challenging benchmark, exposing notable moral limitations in current models. Our benchmark is publicly available.
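As a sketch of how such a benchmark might be scored, the snippet below aggregates per-foundation, per-task accuracy from invented result records; the record format is an assumption, not the benchmark's actual harness.

```python
from collections import defaultdict

# Invented result records: one entry per (scenario image, task) evaluation.
results = [
    {"foundation": "Care", "task": "judgement", "correct": True},
    {"foundation": "Fairness", "task": "classification", "correct": False},
    {"foundation": "Care", "task": "response", "correct": True},
]

hits, totals = defaultdict(int), defaultdict(int)
for r in results:
    key = (r["foundation"], r["task"])
    totals[key] += 1
    hits[key] += r["correct"]  # bool counts as 0/1
for key in sorted(totals):
    print(key, hits[key] / totals[key])
```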
Memorization Over Reasoning? Exposing and Mitigating Verbatim Memorization in Large Language Models' Character Understanding Evaluation
Jiang, Yuxuan, Ferraro, Francis
Recently, Large Language Models (LLMs) have shown impressive performance in character understanding tasks, such as analyzing the roles, personalities, and relationships of fictional characters. However, the extensive pre-training corpora used by LLMs raise concerns that they may rely on memorizing popular fictional works rather than genuinely understanding and reasoning about them. In this work, we argue that 'gist memory' (capturing essential meaning) should be the primary mechanism for character understanding tasks, as opposed to 'verbatim memory' (exact matching of strings). We introduce a simple yet effective method to mitigate mechanized memorization in character understanding evaluations while preserving the essential implicit cues needed for comprehension and reasoning. Our approach reduces memorization-driven performance on popular fictional works from 96% accuracy to 72% and results in up to an 18% drop in accuracy across various character understanding tasks. These findings underscore the issue of data contamination in existing benchmarks, which often measure memorization rather than true character understanding.
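One plausible perturbation in the spirit of this mitigation (the authors' exact procedure may differ) is to replace verbatim character names with neutral aliases, forcing the model to rely on gist rather than surface strings:

```python
import re

# Hypothetical alias table; names chosen purely for illustration.
ALIASES = {"Elizabeth Bennet": "Character A", "Mr. Darcy": "Character B"}

def mask_names(text, aliases=ALIASES):
    """Replace every verbatim character name with a neutral alias."""
    for name, alias in aliases.items():
        text = re.sub(re.escape(name), alias, text)
    return text

passage = "Elizabeth Bennet slowly revises her opinion of Mr. Darcy."
print(mask_names(passage))
# Character A slowly revises her opinion of Character B.
```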
WHAT-IF: Exploring Branching Narratives by Meta-Prompting Large Language Models
Huang, Runsheng "Anson", Martin, Lara J., Callison-Burch, Chris
WHAT-IF (Writing a Hero's Alternate Timeline through Interactive Fiction) is a system that uses zero-shot meta-prompting to create branching narratives from a prewritten story. Played as an interactive fiction (IF) game, WHAT-IF lets the player choose between decisions that the large language model (LLM) GPT-4 generates as possible branches in the story. Starting from an existing linear plot as input, the system creates a branch at each key decision taken by the main character. By meta-prompting the LLM to consider the major plot points from the story, the system produces coherent and well-structured alternate storylines. WHAT-IF stores the branching plot tree in a graph, which helps it both keep track of the story for prompting and maintain the structure for the final IF system. Figure 1 shows the WHAT-IF user interface, filled with the main character, title, and plot of the TV show WandaVision. A video demo of our system can be found here: https://youtu.be/8vBqjqtupcc.
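A minimal sketch of the branching structure and meta-prompt described above might look like the following; the node fields and prompt wording are assumptions, not WHAT-IF's actual code.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class PlotNode:
    summary: str                    # scene summary at this branch point
    decision: Optional[str] = None  # the protagonist's choice that led here
    children: list["PlotNode"] = field(default_factory=list)

def branch_prompt(node, major_plot_points):
    # Meta-prompt skeleton: restate the major plot points so the LLM's
    # generated branches stay coherent with the overall story arc.
    return (
        "Major plot points: " + "; ".join(major_plot_points) + "\n"
        f"Current scene: {node.summary}\n"
        "List two decisions the main character could take next."
    )

root = PlotNode(summary="The hero discovers the hidden letter.")
print(branch_prompt(root, ["hero finds letter", "villain revealed", "final duel"]))
```

Each decision returned by the LLM would become a child `PlotNode`, growing the tree that the IF front end later traverses.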